The Human Development Index (HDI) emphasizes the capabilities of people and uses them as a criterion for assessing the development of a country (“Human Development Index (HDI) | Human Development Reports”, 2021). The data used here is extracted from the United Nations Development Programme: Human Development Reports.
The data set provides indicators for countries with very high, high, medium and low human development. It compares countries on HDI value, HDI rank (2019, 2018) and indicators linked to the Sustainable Development Goals (SDGs) for 2019, where SDG 3 = life expectancy at birth, SDG 4.3 = expected years of schooling, SDG 4.4 = mean years of schooling, and SDG 8.5 = gross national income (GNI) per capita.
The objective of the report is to analyze this data set by answering four research questions.
Figure 2.1: Number of Countries in Each Degree of Human Development
Figure 2.2: Summary Plot
Figure 2.2 shows the positions of the minimum, the maximum, the first quartile, the median (second quartile), the third quartile and the outliers for each variable.
Minimum: The end of the line below the box marks the lowest value in the data for that variable, excluding any points drawn as outliers.
Maximum: The end of the line above the box marks the highest value in the data for that variable, excluding any points drawn as outliers.
Quartiles: The box spans the middle 50% of the data points, from the 1st to the 3rd quartile.
Quartile 2 or median: The line inside each box marks the median value of the data set for that variable.
Outliers: Outliers appear only in the variables expected years of schooling and GNI per capita.
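The box plot elements listed above can be computed directly. The following is a minimal Python sketch, not the code used for Figure 2.2. It assumes the common 1.5 × IQR rule for flagging outliers; the sample values and the function name `box_plot_stats` are hypothetical.

```python
from statistics import quantiles

def box_plot_stats(values, k=1.5):
    """Return quartiles, whisker ends and outliers for one variable."""
    data = sorted(values)
    # 'inclusive' interpolates between data points, treating the sample
    # minimum and maximum as the 0th and 100th percentiles.
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - k * iqr, q3 + k * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "whisker_low": inside[0],    # lowest value that is not an outlier
        "whisker_high": inside[-1],  # highest value that is not an outlier
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }

# Hypothetical sample with one extreme value (assumption, not report data)
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
stats = box_plot_stats(sample)
print(stats)
```

In this sample only the value 100 falls outside the fences, so the upper whisker stops at 9 rather than at the overall maximum.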
Similar statistics are explained in detail below:
| Variable | Degree_of_Human_Development | Minimum | Median | Mean | Maximum | SD |
|---|---|---|---|---|---|---|
| Expected_years_of_schooling | VERY HIGH HUMAN DEVELOPMENT | 12.04 | 16.12 | 16.14 | 21.95 | 1.82 |
| Expected_years_of_schooling | HIGH HUMAN DEVELOPMENT | 11.19 | 13.61 | 13.60 | 16.87 | 1.14 |
| Expected_years_of_schooling | MEDIUM HUMAN DEVELOPMENT | 8.28 | 11.60 | 11.50 | 13.72 | 1.13 |
| Expected_years_of_schooling | LOW HUMAN DEVELOPMENT | 5.01 | 9.70 | 9.30 | 12.66 | 1.87 |
| GNI_per_capita | VERY HIGH HUMAN DEVELOPMENT | 14428.80 | 39870.68 | 42929.79 | 131031.59 | 20854.39 |
| GNI_per_capita | HIGH HUMAN DEVELOPMENT | 5039.04 | 13009.07 | 13184.34 | 26903.25 | 4763.56 |
| GNI_per_capita | MEDIUM HUMAN DEVELOPMENT | 2253.35 | 4960.53 | 5694.22 | 13944.13 | 2682.41 |
| GNI_per_capita | LOW HUMAN DEVELOPMENT | 753.91 | 2132.96 | 2385.03 | 5689.35 | 1284.51 |
| HDI_Value | VERY HIGH HUMAN DEVELOPMENT | 0.80 | 0.88 | 0.88 | 0.96 | 0.05 |
| HDI_Value | HIGH HUMAN DEVELOPMENT | 0.70 | 0.74 | 0.75 | 0.80 | 0.03 |
| HDI_Value | MEDIUM HUMAN DEVELOPMENT | 0.55 | 0.61 | 0.62 | 0.70 | 0.04 |
| HDI_Value | LOW HUMAN DEVELOPMENT | 0.39 | 0.48 | 0.49 | 0.55 | 0.05 |
| Life_expectancy | VERY HIGH HUMAN DEVELOPMENT | 72.58 | 80.21 | 79.45 | 84.86 | 3.29 |
| Life_expectancy | HIGH HUMAN DEVELOPMENT | 64.13 | 74.25 | 74.00 | 78.93 | 3.27 |
| Life_expectancy | MEDIUM HUMAN DEVELOPMENT | 58.74 | 69.66 | 68.43 | 76.68 | 4.72 |
| Life_expectancy | LOW HUMAN DEVELOPMENT | 53.28 | 62.05 | 61.95 | 69.02 | 4.37 |
| Mean_years_of_schooling | VERY HIGH HUMAN DEVELOPMENT | 7.28 | 12.14 | 11.62 | 14.15 | 1.47 |
| Mean_years_of_schooling | HIGH HUMAN DEVELOPMENT | 7.02 | 9.39 | 9.47 | 11.81 | 1.26 |
| Mean_years_of_schooling | MEDIUM HUMAN DEVELOPMENT | 4.07 | 6.50 | 6.51 | 11.10 | 1.52 |
| Mean_years_of_schooling | LOW HUMAN DEVELOPMENT | 1.64 | 3.93 | 4.25 | 6.76 | 1.37 |
Table 2.1 shows the minimum, median, mean, maximum and standard deviation of each SDG indicator mentioned above and of the HDI value, for each degree of human development.
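A table of this shape can be produced by grouping the observations and applying the five statistics to each group. The sketch below is illustrative only: the rows, the group labels "A" and "B", and the function name `summarise_by_group` are hypothetical, and it assumes the sample standard deviation (n − 1 in the denominator).

```python
from collections import defaultdict
from statistics import mean, median, stdev

def summarise_by_group(rows, value_key, group_key="group"):
    """Minimum, median, mean, maximum and sample SD of value_key per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {
        name: {
            "min": min(vals),
            "median": median(vals),
            "mean": mean(vals),
            "max": max(vals),
            "sd": stdev(vals),  # sample standard deviation (assumption)
        }
        for name, vals in groups.items()
    }

# Hypothetical rows, not taken from the report's data set
rows = [
    {"group": "A", "life_expectancy": 80.0},
    {"group": "A", "life_expectancy": 82.0},
    {"group": "A", "life_expectancy": 84.0},
    {"group": "B", "life_expectancy": 60.0},
    {"group": "B", "life_expectancy": 64.0},
]
summary = summarise_by_group(rows, "life_expectancy")
print(summary)
```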
In this section, we analyze the economic aspect for two groups: very high human development countries and medium human development countries.
The analysis is based on the GNI per capita given in the data set.
The aim is to find out how GNI per capita relates to the HDI value in the two groups (very high and medium human development countries), and whether the very high human development countries translate their income into human development better than the medium development countries.
Figure 2.3: GNI distribution for very high and medium development group
Figure 2.3 above shows that the distributions of both groups are skewed. Hence, we apply a log transformation to GNI per capita for further analysis.
Figure 2.4: GNI distribution for very high and medium development group
Figure 2.4 shows the distribution of the logarithm of GNI per capita.
Here, the log GNI values of both groups appear approximately normally distributed.
We can see that the very high human development group has markedly higher GNI values than the medium development group, and only a small portion of the medium development group's density curve overlaps with that of the other group. This indicates that higher GNI may be associated with a higher level of development.
| Degree_of_Human_Development | min | q1 | median | q3 | max | mean | sd | n |
|---|---|---|---|---|---|---|---|---|
| MEDIUM HUMAN DEVELOPMENT | 7.7 | 8.3 | 8.5 | 8.9 | 9.5 | 8.6 | 0.4 | 37 |
| VERY HIGH HUMAN DEVELOPMENT | 9.6 | 10.2 | 10.6 | 10.9 | 11.8 | 10.6 | 0.5 | 66 |
Table 2.2 shows the summary statistics for the log of GNI_per_capita, where ‘n’ is the number of countries in the given degree of human development.
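The log-scale values can be cross-checked against the raw GNI figures in Table 2.1. The sketch below assumes the natural logarithm was used; the dictionary names are ours. Because the logarithm is monotonic, the minimum and maximum carry over exactly, and the median carries over exactly or very nearly, whereas the mean of the logs is not the log of the mean and is therefore left out.

```python
import math

# Raw GNI per capita statistics copied from Table 2.1
raw_gni = {
    "MEDIUM HUMAN DEVELOPMENT": {"min": 2253.35, "median": 4960.53, "max": 13944.13},
    "VERY HIGH HUMAN DEVELOPMENT": {"min": 14428.80, "median": 39870.68, "max": 131031.59},
}

# Natural log of each statistic, rounded to one decimal as in Table 2.2
log_gni = {
    group: {name: round(math.log(value), 1) for name, value in stats.items()}
    for group, stats in raw_gni.items()
}

for group, stats in log_gni.items():
    print(group, stats)
```

Under that assumption the rounded values agree with the min, median and max columns of Table 2.2 (7.7, 8.5, 9.5 and 9.6, 10.6, 11.8).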
Figure 2.5: No. of Countries in each degree of HDI V/S GNI per capita
Comparing the summary statistics in Table 2.2 with Figure 2.5, it is clear that GNI per capita in the very high human development group is higher overall than in the medium human development group. This suggests that higher income is associated with a higher human development group, and therefore a higher HDI value.
Figure 2.6: HDI Value V/S GNI per capita for Very High and Medium HDI Group
In Figure 2.6, we fit a linear model for each group, with the log of GNI per capita as the response and the HDI value as the predictor.
We can see that the regression line for the very high HDI group lies above that of the medium HDI group. However, as the GNI values of the two groups do not overlap, we generate the model summaries for both groups as follows:
Dependent variable: Log of GNI per capita

| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 4.67 | 3.26 – 6.07 | <0.001 |
| HDI_Value | 6.71 | 5.11 – 8.30 | <0.001 |
| Observations | 66 | | |
| R² / R² adjusted | 0.525 / 0.517 | | |
The above table shows a slope of 6.7067 for the very high human development group.
Dependent variable: Log of GNI per capita

| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 4.84 | 3.02 – 6.65 | <0.001 |
| HDI_Value | 6.01 | 3.08 – 8.94 | <0.001 |
| Observations | 37 | | |
| R² / R² adjusted | 0.331 / 0.312 | | |
The above table shows a slope of 6.0107 for the medium human development group.
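To make the two slopes easier to read, the fitted lines can be evaluated directly. The sketch below uses the reported intercepts and slopes and assumes the response is the natural log of GNI per capita; the function names and the choice of a 0.01 step in HDI are ours. It is an illustration of how to read the coefficients, not part of the original analysis.

```python
import math

# Coefficients as reported in the two model summaries
models = {
    "very_high": {"intercept": 4.67, "slope": 6.7067},
    "medium": {"intercept": 4.84, "slope": 6.0107},
}

def predict_log_gni(model, hdi):
    """Fitted log GNI per capita at a given HDI value."""
    return model["intercept"] + model["slope"] * hdi

def gni_ratio_per_hdi_step(model, step=0.01):
    """Multiplicative change in fitted GNI when HDI rises by `step`."""
    return math.exp(model["slope"] * step)

# Evaluate each line at its group's mean HDI value from Table 2.1
pred_very_high = predict_log_gni(models["very_high"], 0.88)
pred_medium = predict_log_gni(models["medium"], 0.62)

ratio_very_high = gni_ratio_per_hdi_step(models["very_high"])
ratio_medium = gni_ratio_per_hdi_step(models["medium"])

print(pred_very_high, pred_medium, ratio_very_high, ratio_medium)
```

Evaluated at the group mean HDI, each line returns a value close to the group's mean log GNI in Table 2.2 (10.6 and 8.6), as expected for a least-squares fit. An HDI that is 0.01 higher goes with a fitted GNI that is roughly 7% higher in the very high group and roughly 6% higher in the medium group.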
Now comparing with the diagnostic plots:
Figure 2.7: Diagnostic Plots for Linear Models
In Figure 2.7, the left-hand diagnostic plot shows predicted values for the very high HDI group and the right-hand diagnostic plot shows those for the medium HDI group.
| r.squared | adj.r.squared | sigma | statistic | p.value | df | logLik | AIC | BIC | deviance | df.residual | nobs |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.5247895 | 0.5173643 | 0.3244632 | 70.67715 | 0 | 1 | -18.34599 | 42.69197 | 49.26094 | 6.737687 | 64 | 66 |
| r.squared | adj.r.squared | sigma | statistic | p.value | df | logLik | AIC | BIC | deviance | df.residual | nobs |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.3311233 | 0.3120126 | 0.3627941 | 17.32654 | 0.0001946 | 1 | -13.95765 | 33.91531 | 38.74806 | 4.606685 | 35 | 37 |
By checking the diagnostic plots of the linear models in Figure 2.7, we can see that the residuals of both models are approximately normally distributed and that the models fit the data reasonably well.
However, the R-squared values in Tables 2.3 and 2.4 show that both models have a fairly low coefficient of determination. This suggests that GNI is not the only fundamental variable that determines the HDI value.
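The fit statistics reported for the two models are tied together by standard identities for a regression with one predictor, which gives a quick consistency check. The sketch below uses the reported R-squared values and sample sizes; the function names are ours. It computes the adjusted R-squared, the F statistic and the absolute correlation implied by each R-squared.

```python
import math

# R-squared, observations and residual degrees of freedom as reported
fits = {
    "very_high": {"r2": 0.5247895, "n": 66, "df_resid": 64},
    "medium": {"r2": 0.3311233, "n": 37, "df_resid": 35},
}

def adjusted_r2(r2, n, p=1):
    """Adjusted R-squared for a model with p predictors and an intercept."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

def f_statistic(r2, df_resid, p=1):
    """Overall F statistic implied by R-squared."""
    return (r2 / p) / ((1 - r2) / df_resid)

def abs_correlation(r2):
    """Absolute Pearson correlation in a one-predictor regression."""
    return math.sqrt(r2)

checks = {
    name: {
        "adj_r2": adjusted_r2(f["r2"], f["n"]),
        "f": f_statistic(f["r2"], f["df_resid"]),
        "abs_r": abs_correlation(f["r2"]),
    }
    for name, f in fits.items()
}
print(checks)
```

The results match the reported adjusted R-squared and F statistics, and the implied correlations are about 0.72 for the very high group and about 0.58 for the medium group.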
Therefore, we can conclude that although the very high HDI group performs better than the medium HDI group in both GNI and HDI, there is only a moderate correlation between GNI and HDI values, which gives limited support to the claim that very high HDI countries translate income better.
Figure 2.8: Gap between the ‘expected years of education’ and the ‘average years of education’, for different levels of Human Development
Figure 2.8 indicates that the residual (gap) between the expected years of education and the average years of education is larger in some countries with high human development than in countries with low human development (Kovacevic, 2010).
| Country | Degree_of_Human_Development | residual_years_of_schooling |
|---|---|---|
| Australia | VERY HIGH HUMAN DEVELOPMENT | 9.229639 |
| Bhutan | MEDIUM HUMAN DEVELOPMENT | 8.908858 |
| Benin | LOW HUMAN DEVELOPMENT | 8.788222 |
| Turkey | VERY HIGH HUMAN DEVELOPMENT | 8.495846 |
| Morocco | MEDIUM HUMAN DEVELOPMENT | 8.073170 |
| Uruguay | VERY HIGH HUMAN DEVELOPMENT | 7.909910 |
| Tunisia | HIGH HUMAN DEVELOPMENT | 7.905390 |
| Grenada | HIGH HUMAN DEVELOPMENT | 7.837766 |
| Timor-Leste | MEDIUM HUMAN DEVELOPMENT | 7.821715 |
| Burundi | LOW HUMAN DEVELOPMENT | 7.781347 |
| Nepal | MEDIUM HUMAN DEVELOPMENT | 7.742130 |
| Belgium | VERY HIGH HUMAN DEVELOPMENT | 7.724110 |
Table 2.5 above lists the 12 countries with the highest residual values (the gap between the ‘expected years of education’ and the ‘average years of education’).
| Degree_of_Human_Development | Minimum RYS | Maximum RYS | Median RYS |
|---|---|---|---|
| VERY HIGH HUMAN DEVELOPMENT | 1.462530 | 9.229639 | 4.112699 |
| HIGH HUMAN DEVELOPMENT | -0.174090 | 7.905390 | 4.157615 |
| MEDIUM HUMAN DEVELOPMENT | 0.928100 | 8.908858 | 4.987901 |
| LOW HUMAN DEVELOPMENT | 0.496258 | 8.788222 | 5.108076 |
However, Table 2.6 shows that the median residual years of schooling (RYS, the difference between the ‘expected years of education’ and the ‘average years of education’) is smaller for groups with a higher degree of human development, which contradicts the previous observation.
Figure 2.9: Details about median residual years of schooling for different Degree of Human Development
Figure 2.9 further shows that there are more countries in the ‘high HDI group’ than in the ‘low HDI group’, and that only a small part of the ‘high HDI group’ has excessive differences.
Although countries in the ‘low HDI group’ have a lower ‘maximum difference value’, the large difference in most of these countries leads to a large average.
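The point about averages can be checked against Table 2.1, because the mean of the per-country gaps equals the difference between the two group means. The following is a small Python sketch under the assumption that both schooling variables are observed for the same countries in each group; the dictionary names are ours.

```python
# Group means copied from Table 2.1: (expected years, mean years of schooling)
group_means = {
    "VERY HIGH": (16.14, 11.62),
    "HIGH": (13.60, 9.47),
    "MEDIUM": (11.50, 6.51),
    "LOW": (9.30, 4.25),
}

# Mean gap per group = mean(expected) - mean(mean years), by linearity of the mean
mean_gap = {
    group: round(expected - achieved, 2)
    for group, (expected, achieved) in group_means.items()
}

largest = max(mean_gap, key=mean_gap.get)
print(mean_gap, largest)
```

Under that assumption the mean gap is largest in the low group (about 5.05 years). The very high group's mean gap (about 4.52 years) exceeds its median gap of 4.11 in Table 2.6, which is consistent with a small number of countries having very large gaps.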
Therefore, it is appropriate to use both the ‘expected years of education’ and the ‘average years of education’ for the calculation of HDI.
The HDI rank of some countries differs between 2018 and 2019. The value below shows the number of countries whose HDI rank differs between the two years.
| No. of countries with rank difference |
|---|
| 112 |
Table 2.7 above shows the total number of countries whose HDI rank differs between 2018 and 2019.
Figure 2.10: Rank difference v/s Countries
Figure 2.10 plots the difference between the 2019 HDI rank and the 2018 HDI rank for the countries of the different HDI groups.
We observe that more countries experienced a decline in rank from 2018 to 2019 (negative values on the graph) than a rise.
Among the countries whose rank declined from 2018 to 2019, the largest negative values are observed in the ‘high HDI group’.
Among the countries whose rank rose from 2018 to 2019, the largest increase is also observed in the ‘high HDI group’.
| No. of countries with the same rank |
|---|
| 77 |
Table 2.8 above shows the total number of countries that have the same HDI rank in 2018 and 2019.
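Counts like these come from comparing each country's rank in the two years. The following is a minimal Python sketch with hypothetical countries and ranks. It assumes the difference is computed as the 2018 rank minus the 2019 rank, so that a negative value means a decline, in line with the sign convention described for Figure 2.10; the function name `classify_rank_changes` is ours.

```python
def classify_rank_changes(ranks):
    """Count countries whose rank improved, declined or stayed the same.

    `ranks` maps a country name to a (rank_2018, rank_2019) pair.
    """
    counts = {"improved": 0, "declined": 0, "unchanged": 0}
    for rank_2018, rank_2019 in ranks.values():
        diff = rank_2018 - rank_2019  # negative: rank number grew, i.e. a decline
        if diff > 0:
            counts["improved"] += 1
        elif diff < 0:
            counts["declined"] += 1
        else:
            counts["unchanged"] += 1
    return counts

# Hypothetical ranks for illustration only (not from the data set)
example_ranks = {
    "Country A": (1, 1),
    "Country B": (10, 12),
    "Country C": (20, 18),
    "Country D": (30, 33),
}
counts = classify_rank_changes(example_ranks)
changed = counts["improved"] + counts["declined"]
print(counts, changed)
```

Applied to the report's figures, the 112 countries with a changed rank and the 77 with an unchanged rank together cover 189 countries.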
Figure 2.11: Countries with same HDI rank for 2018 and 2019
Figure 2.11 above gives the names of the countries that maintained their rank from 2018 to 2019 (no change in rank).
Most of these countries fall under Very High Human Development and few under Low Human Development.
HDI value, life expectancy, and mean years of schooling have more values as compared to expected years of schooling and GNI. Also, outliers exist only in the variables expected years of schooling and GNI per capita.
Higher income is associated with a higher human development group, which means a higher HDI value.
The very high HDI group appears to translate income better, although the correlation between GNI and HDI is only moderate.
It is appropriate to use both the ‘expected years of education’ and the ‘average years of education’ for the calculation of HDI.
There are 112 countries that have a different HDI rank in 2018 and 2019. Among these, more countries declined in rank from 2018 to 2019 than rose.
There are 77 countries that have the same rank in 2018 and 2019, among which Norway has maintained its 1st rank overall.
[1] Wickham et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
[2] Hadley Wickham and Jim Hester (2020). readr: Read Rectangular Text Data. R package version 1.4.0. https://CRAN.R-project.org/package=readr
[3] Hadley Wickham and Evan Miller (2020). haven: Import and Export ‘SPSS’, ‘Stata’ and ‘SAS’ Files. R package version 2.3.1. https://CRAN.R-project.org/package=haven
[4] Hao Zhu (2021). kableExtra: Construct Complex Table with ‘kable’ and Pipe Syntax. R package version 1.3.4. https://CRAN.R-project.org/package=kableExtra
[5] R Core Team (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
[6] Frank E Harrell Jr, with contributions from Charles Dupont and many others. (2021). Hmisc: Harrell Miscellaneous. R package version 4.5-0. https://CRAN.R-project.org/package=Hmisc
[7] Baptiste Auguie (2017). gridExtra: Miscellaneous Functions for “Grid” Graphics. R package version 2.3. https://CRAN.R-project.org/package=gridExtra
[8] Katherine Goode and Kathleen Rey (2019). ggResidpanel: Panels and Interactive Versions of Diagnostic Plots using ‘ggplot2’. R package version 0.3.0. https://CRAN.R-project.org/package=ggResidpanel
[9] Claus O. Wilke (2020). cowplot: Streamlined Plot Theme and Plot Annotations for ‘ggplot2’. R package version 1.1.1. https://CRAN.R-project.org/package=cowplot
[10] Lüdecke D (2021). sjPlot: Data Visualization for Statistics in Social Science. R package version 2.8.7. https://CRAN.R-project.org/package=sjPlot
[11] Silge J, Robinson D (2016). tidytext: Text Mining and Analysis Using Tidy Data Principles in R. Journal of Open Source Software, 1(3). https://doi.org/10.21105/joss.00037
[12] Kovacevic, M. (2010). Review of HDI Critiques and Potential Improvements. Human Development Research Paper, 2010/33. Retrieved 24 May 2021, from https://www.researchgate.net/publication/235945302_Review_of_HDI_Critiques_and_Potential_Improvements_Human_Development_Research_Paper_201033
[13] Human Development Index (HDI) | Human Development Reports. (2021). Retrieved 24 May 2021, from http://hdr.undp.org/en/content/human-development-index-hdi